Traditional surveys versus ecological momentary assessments: Digital citizen science approaches to improve ethical physical activity surveillance among youth

The role of physical activity (PA) in minimizing non-communicable diseases is well established. Measurement bias can be reduced via ecological momentary assessments (EMAs) deployed via citizen-owned smartphones. This study aims to engage citizen scientists to understand how PA reported digitally by retrospective and prospective measures varies within the same cohort. This study used the digital citizen science approach to collaborate with citizen scientists, aged 13–21 years over eight consecutive days via a custom-built app. Citizen scientists were recruited through schools in Regina, Saskatchewan, Canada in 2018 (August 31—December 31). Retrospective PA was assessed through a survey, which was adapted from three validated PA surveys to suit smartphone-based data collection, and prospective PA was assessed through time-triggered EMAs deployed consecutively every day, from day 1 to day 8, including weekdays and weekends. Data analyses included paired t-tests to understand the difference in PA reported retrospectively and prospectively, and linear regressions to assess contextual and demographic factors associated with PA reported retrospectively and prospectively. Findings showed a significant difference between PA reported retrospectively and prospectively (p = 0.001). Ethnicity (visible minorities: β = - 0.911, 95% C.I. = -1.677, -0.146), parental education (university: β = 0.978, 95% C.I. = 0.308, 1.649), and strength training (at least one day: β = 0.932, 95% C.I. = 0.108, 1.755) were associated with PA reported prospectively. In contrast, the number of active friends (at least one friend: β = 0.741, 95% C.I. = 0.026, 1.458) was associated with retrospective PA. Physical inactivity is the fourth leading cause of mortality globally, which requires accurate monitoring to inform population health interventions. In this digital age, where ubiquitous devices provide real-time engagement capabilities, digital citizen science can transform how we measure behaviours using citizen-owned ubiquitous digital tools to support prevention and treatment of non-communicable diseases.


Introduction
Physical activity is an important protective factor that can prevent or minimize non-communicable diseases such as diabetes mellitus, cancer, obesity, hypertension, and joint conditions [1][2][3][4].However, measuring physical activity (PA) can be plagued with challenges and inaccuracies, such as over-reporting of PA [5], recall bias [6], lack of environmental and social context of PA, and difficulty in reporting PA in the form of intensities (e.g., moderate, and vigorous activities) [7][8][9].The continued understanding of how PA is accumulated, and its accurate measurement, is crucial in identifying patterns of PA, and to accurately monitor population adherence to PA recommendations [10,11].
Traditionally, retrospective means of measurement have been used to measure PA accumulation [12,13].Retrospective surveys in general are easy to implement and are not resource intensive [14], however they tend to overestimate PA [15,16], which can be attributed to recall biases [8,17].For instance, in a study carried out on lower back pain patients and healthy controls, it was determined that retrospective surveys overestimated self-reported moderate physical activity by 42min/day, and vigorous activity by 39min/day [18].Objective measures of PA, such as the use of accelerometers, global positioning system, heart rate monitoring, and movement sensors, can solve the problem of recall bias present in retrospective subjective questionnaires.However, objective measures can be time consuming [19], expensive [20], and logistically challenging to implement across populations [21][22][23].
Advances in information and communication technology offer novel opportunities in PA measurement [24,25].Ubiquitous tools such as smartphones can enable ecological momentary assessments (EMAs) to be deployed via smartphones in near real-time, and with more frequency by using time-, location-, and user-triggers, which provide flexibility for both researchers and study participants [26,27].EMAs assess participants' experience/behaviour in realtime, and in the real-world, where researchers use sampling and monitoring strategies to assess phenomena as they occur in natural settings [17].More recently, the use of EMAs in assessing PA behaviour using citizen-owned smartphones has gained momentum among researchers [26][27][28], due to its ability to eliminate recall [23], and social desirability biases [29] that are inherent in retrospective subjective PA surveys [30,31].
This study aims to ascertain if there is a significant difference between the duration of PA reported retrospectively using traditional validated surveys, and duration of PA reported prospectively using EMAs, within the same cohort of participants.In addition, this study assesses contextual and demographic factors that are associated with duration of PA reported retrospectively vs. prospectively (EMAs) within the same cohort of participants.

Study design
This study combined cross-sectional validated survey measures and longitudinal EMAs to engage with the same cohort of youth citizen scientists in an urban centre (Regina) in the Canadian prairie province of Saskatchewan.The study captures PA behaviours, and its related factors from youth who participated in the Smart Platform as youth citizen scientists [26].The Smart Platform is a citizen science and digital epidemiological initiative for population health surveillance, knowledge translation, and real-time interventions [32,33].It combines participatory, community-based, and citizen science approaches to leverage citizen-owned smartphones to ethically engage citizen scientists for population health research.The research ethics approval for the Smart Platform was approved by the Research Ethics Boards of the Universities of Regina and Saskatchewan (REB # 2017-029).
The Smart Platform enabled our research team to use a custom-built smartphone application (app) to engage with citizen scientists [26,27] over eight consecutive days [34].Youth citizen scientists had the option to download the app from both the iOS and Android platforms onto their smartphones.Using the app, apart from PA data, a wide range of behavioural, contextual, demographic, social factors were reported by youth [26,27,33].This study used the following data that were derived using surveys deployed via the app [25,26]: family and peer support for PA; sociodemographic characteristics; and individual characteristics that determined overall PA, such as strength training (S1 Fig).

Participants
A sample size calculation at 90% confidence level, with 5% margin of error resulted in the required sample size of 273.A total of 808 youth citizen scientists (13-21 years) were recruited through Regina Public and Catholic Schools engagement sessions held in various high schools in Regina, Saskatchewan, Canada between August 31st, and December 31st, 2018.Citizen scientists were recruited through a collaborative effort between the school administrators and the research team.Scheduled in-person recruitment sessions were organized between the research team and the youth.Activities during the recruitment sessions included describing the study to youth, demonstration of how to use the app, answering queries and concerns, and assisting youth in downloading the app onto their respective smartphones.All youth participants of the study provided informed consent through the app.For youth participants between the ages of 13-16 years, implied informed consent was obtained from their caregivers and parents before the scheduled recruitment sessions.

Measures
PA (dependent variables).On day one of the study, using a time-triggered smartphone nudge, retrospective PA data (over previous 7 days) were collected from youth through a survey adapted for smartphone deployment from three validated self-reported measures: the international physical activity questionnaire, the simple physical activity questionnaire, and the global physical activity questionnaire [35][36][37][38][39].The adaptation allowed youth to report their PA accumulation in the previous seven days before joining the study by clearly listing each of the previous seven days, irrespective of when they joined the study (S2 Fig) .For instance, if a youth joined the study on August 31, the app would prompt them to report PA on August 30,29,28,27,26,25,and 24.As part of this adaptation, PA was defined (Table 1) and youth were then asked to report minutes spent on PA activity, which is similar to how PA accumulation was assessed in the three validated surveys.The modification allowed time-triggered digital deployment of the retrospective survey and accommodated the varying start dates of youth joining the study (S2 Fig) .From these responses, mean overall PA per day (will be referred to as: retrospective PA duration) was derived.Following general PA data derivation standards [40,41], youth who reported less than 10 minutes or more than 960 minutes (about 16 hours) per day were excluded from the analyses [41,42].
Prospective PA information was obtained via daily time triggered EMAs throughout the study period (eight consecutive days), to include both weekdays and weekends.The EMAs were deployed every evening between 8:00 PM to 11: 30 PM and were set to expire at midnight [26].EMAs used skip-pattern questions to capture PA accumulation (S3 Fig) .After defining what constitutes PA (Table 1), EMAs asked youth the following questions: 1) "What type of physical activities did you do today?"(Multiple choice); 2) "How many minutes did you spend doing this activity?"(Open ended).From these questions, mean PA per day was derived (will be referred to as: EMA PA duration).Both retrospective and prospective PA were the dependent variables used in this study.
Family support for PA (independent variables).Family support for PA was captured using one question: "How much do your parents, stepparents, or guardians support you in being physically active?(e.g., driving you to team games, buying you sporting equipment)" with the 4 response options: "very supportive", "supportive", "unsupportive", or "very unsupportive".Using these data, we collapsed the responses into: "unsupportive" (combining unsupportive and very unsupportive), "supportive" and "very supportive".
Peer support for PA (independent variables).Youth were asked to think about their closest friends in the last 12 months when answering the question regarding peer support of PA.Peer support for PA was captured with the question: "How many of your closest friends are physically active?" with the six response options: "none of my friends", "1", "2", "3", "4", or "5 of my friends".This variable was dichotomized into "0 physically active friends" corresponding to "none of my friends" and "at least 1 active friend" corresponding to "1", "2", "3", "4" or "5 of my friends".Sociodemographic covariates.Gender was ascertained with the question, "What is your gender?", with 5 response options: "male", "female", "transgender", "other (please specify)", and "prefer not to disclose".The responses "transgender", "other", and "prefer not to disclose" were collapsed into one category due to low counts within these categories.Parental education was measured by asking youth to report the "highest education" of one of their parents or guardians, with six response options: "elementary school", "some secondary/high school", "completed high school", "some post-secondary (university/college)", "received university or college degree /diploma", and "does not apply".From these responses, four categories of parental education were derived: 1) "elementary school" corresponds to "elementary school or below", 2) "some secondary/high school" and "completed high school" corresponds to "at least secondary school" 3) "some post-secondary (university /college)", "received university or college degree/diploma" corresponds to "university and above" and "does not apply".Youth citizen scientists were also asked about their ethnicity, with the following response options: "First Nations", "Dene", "Cree", "Metis", "Inuit", "African", "Asian", "Canadian", "Caribbean/West Indian", "Eastern European", "European", "South Asian", "other", and "Mixed".From these responses, four categories were extracted: 1) "Indigenous" which corresponds to "First Nations", "Dene", "Cree", "Metis", "Inuit", 2) "Canadian", 3) "mixed" and 4) "visible minorities".The visible minorities include "African", "Asian", "Caribbean/West Indian", "Eastern European", "European", "South Asian", and "other" categories.The visible minorities category was created due to low count within these ethnic categories.

Data and risk management
To ensure confidentiality, all data were encrypted before being streamed to a secure cloud server.Identifiable artifacts (e.g., photos, voice recordings) were removed or de-identified before the data were analyzed.A permission built into the app restricted access to personally identifiable information (e.g., contact list or network visited).Media Access Control address anonymization was used to protect youth citizen scientists' data based on a simple hash algorithm.Risks and privacy management options were made clear to youth citizen scientists while obtaining informed consent.In addition, citizen scientists not only had the option to drop out of the study or pause data gathering anytime they wished, but also had the option to upload data only when they had Wi-Fi access and /or when their smartphones were plugged into a power source.Together with the above features, youth citizen scientists also had the option to drop out of the study at any point of time [27]

Data analysis
Descriptive statistics, such as frequencies and percentages were used in describing the independent variables of this study.Paired sample t-test inferential statistics was used to ascertain difference between mean minutes of PA reported via retrospective PA survey and mean minutes of PA reported via EMAs.Multiple linear regression models were used to assess factors associated with mean minutes of PA reported retrospectively and prospectively (EMAs), while adjusting for control variables.Data analyses were conducted using R 4.2.1 statistical tool.A significance level was set at p < 0.05.

Results
A total of 808 youth citizen scientists (13-21 years) were recruited for this study.After excluding participants who did not report on primary dependent and independent variables, the final sample size of this study was 436.
Table 2 presents the summary statistics of youth citizen scientists.Youth were predominantly females (55.8%), with 38.5% being males and 5.7% reporting one of following categories: transgender, other, or preferred not to disclose.Majority of youth identified themselves as Canadian (39.8%), followed by mixed (29.7%),visible minority (25.5%), and Indigenous (5%).In terms of socioeconomic status, most youth (65.1%) reported that one of their parents had a university degree.In terms of strength training, 76.5% of youth reported having at least one day of strength training, while 23.5% reported having zero days of strength training.
In terms of social context/support for PA, 88.7% reported having at least one or more physically active friends, while 45.3% reported that parents/guardians are supportive of their PA, and 42.9% reported that parents/guardians are very supportive of their PA.
The summary statistics of the dependent variables: retrospective PA and prospective PA EMAs were presented in Table 3.The mean time spent on PA per day (in minutes) reported via the retrospective PA survey and prospective EMAs were 93 and 196, respectively.The paired sample t-test result of 3.237 (p = 0.001) suggests a statistically significant difference between mean duration of PA reported by youth retrospectively and prospectively (EMAs).
The adjusted, linear regression models showing the relationship between (EMA PA [model 1] and Retrospective PA [model 2]), and contextual and demographic factors are presented in Table 4.
In the EMA model (i.e., prospective PA: model 1), youth whose ethnicity was categorized as visible minorities reported less PA (β = -0.911,95% confidence interval [C.I.] = -1.677,-0.146, p-value = 0.024) in comparison with youth whose ethnicity was Canadian.This association was not found to be statistically significant in the retrospective PA model (i.e., retrospective PA: model 2).Similarly, youth who reported at least one parent having a university degree accumulated more EMA PA (β = 0.978, 95% [C.I.] = 0.308, 1.649, p-value = 0.006) in comparison with youth who reported that their parents had at least secondary school education.This association was not found to be statistically significant in the retrospective PA model.Following the same pattern, youth who engaged in at least one day of strength training reported more PA via EMAs (β = 0.932, 95% [C.I.] = 0.108, 1.755, p-value = 0.031) in comparison with youth who reported zero days of strength training.This association was not found to be statistically significant in the retrospective PA model.
In the retrospective PA model, youth who reported having at least one friend who is physically active were significantly associated with more PA (β = 0.741, 95% [C.I.] = 0.026, 1.458, pvalue = 0.048) in comparison to youth who reported having zero physically active friends.This association was not found to be statistically significant in the EMA model.

Discussion
This study was conducted by engaging youth citizen scientists using their own smartphones utilizing a methodology that integrates ethical population health surveillance, integrated knowledge translation, and real-time behavioural interventions [32].The primary purpose of this study was to ascertain if there was a significant difference between the duration of PA reported retrospectively using a modified version of three validated PA questionnaires and duration of PA reported prospectively using EMAs within the same cohort of youth aged 13-21 years.In addition, the study also assessed sociodemographic and contextual factors that are associated with duration of PA reported retrospectively vs. prospectively.Total (n = 415) a 100 a Some youth did not provide a response to this question. https://doi.org/10.1371/journal.pdig.0000294.t002 The primary finding was that there was a significant difference in PA reported retrospectively in comparison with PA reported prospectively using EMAs, with youth reporting more PA using prospective EMAs.However, in a similar study, although carried out on a cohort of adult participants [26], more PA was reported using a validated retrospective survey in comparison to EMAs.Current evidence indicates that there are differences in how adults and youth accumulate PA [44], with youth engaging in more frequent habitual activities [45], resulting in intermittent PA accumulation [45].
Prospective EMAs could be better suited to capture such intermittent PA accumulation because of the ability of EMAs to be self-or time-triggered [27].Moreover, as this study was implemented via ubiquitous digital tools (i.e., smartphones), it is important to consider the level of digital literacy of youth, which is known to be higher than adults [46,47], a fact that could have played a role in youth reporting PA more accurately using EMAs in comparison with adults.Having said that, the nature of recall can also play a significant role in exacerbating or reducing inaccurate reporting [48].In this study, the adaptation of the retrospective survey allowed youth to clearly report PA on each of the previous seven days, irrespective of when they joined the study, an approach that should have improved recall.Thus, the significant overreporting of PA using EMAs among youth needs further exploration with prospective studies.EMA deployment should consider digital literacy of the participants to ensure that the methods of EMA deployment align with the level of digital literacy.In our study, we used citizen science approaches to engage with youth before EMA deployment to ensure that they can report PA with ease [26].This engagement included relationship building with the schools and organizing presentation sessions, where youth had the opportunity to ask questions and even suggest potential changes to EMA deployment.For instance, although before deployment EMAs were set up to expire within an hour, after feedback from youth citizen scientists during the engagement sessions, we increased the time of expiry to maximize daily PA reporting opportunities [27] As for the sociodemographic and contextual factors associated with duration of PA, this study indicates that there are several differences in associations between duration of PA reported retrospectively vs. prospectively.The retrospective PA model (Table 4: Model 2) depicted one significant association, where youth who reported having at least one physically active friend also reported more minutes of PA.This finding is consistent with previous quantitative and qualitative PA studies [49][50][51], where peer support was found to increase PA of youth.The EMA model (Table 4: Model 1) depicted three significant associations.Youth who reported at least one parent having a university degree in comparison with youth whose parents have high school or lower education, and youth who engaged in at least one day of strength training in comparison to youth who reported zero days of strength training reported more minutes of PA.Although not many studies have been carried out to ascertain factors that are associated with PA duration reported using EMAs by youth, these findings are consistent with existing evidence [52][53][54][55].Finally, the EMA model also showed that visible minority youth reported lower duration of PA in comparison with youth who identified themselves as Canadian, a finding that is consistent with previous studies that have examined differences of PA among ethnicities [56,57].
More importantly, although our findings are consistent with existing evidence, the key finding here is the difference in sociodemographic and contextual factors that were associated with duration of PA reported retrospectively vs. prospectively.If we are to develop appropriate interventions to address global physical inactivity [58][59][60], it is critical to accurately understand the factors that determine PA accumulation.A clear difference between factors that are associated with PA reported retrospectively vs. prospectively by the same cohort of individuals shows that further investigation is needed to understand physical activity measurement, particularly in the digital age, where ubiquitous tools are available to obtain data in real-time [26,27,61].
There is considerable evidence that prospective EMAs are effective measures in estimating determinants and correlates of PA.Moreover, in real-time and real-world settings, their validity and reliability in measuring PA have been established [62,63].EMAs reduce participant burden by using digital reminders/nudges that can be triggered on participants' smartphones based on time, and location.EMAs can also be self-triggered by participants, which provides them the capacity to provide information that is tailored to their needs and circumstances [64,65].EMAs are also known to reduce recall bias since participants do not need to recall their behaviours [17], a significant factor in improving PA measurement in real-world settings.
It is important to note that EMAs can transform how data are collected in the digital age, because they can be completed at participants' convenience, and in collaboration between the researchers and the participants i.e., digital citizen science [26,66].It is also important to consider the age cohort involved in data collection i.e., EMAs are more appealing to young participants with greater digital literacy [46,63].
Evidence also indicates that EMAs provide ecological validity of whether associations are significant in relation to typical settings of everyday life [67,68].PA measurement using EMAs provide context, improves data validity through reduction of recall bias and data entry errors as participants are not required to retrospectively recall their behaviours [69,70].EMAs reduce participant burden by using digital reminders/ nudges that can be triggered on participants' phones based on predefined time and location [69,71].In addition, EMAs can also be self-triggered by participants, which allows them some level of personalization to their needs [69,72].One clear indication is that in the digital age, where smartphone usage is almost universal [73,74], it is critical to further explore usage of digital EMAs to capture PA across populations.This exploration is especially important due to PA's role in minimizing non-communicable diseases [1,75].As global PA patterns are consistently reported to educate the public, and to inform policies to prevent non-communicable diseases [76][77][78][79][80][81], and as it is important to accurately capture PA patterns, digital citizen science could play an important role in ethical surveillance of PA [27].

Strengths and limitations
Strengths of this study include further enhancing our understanding regarding digital tools and methodologies in reporting health behaviours.It will contribute to new evidence given the digitization of the surveys themselves, but also because they capture prospective data at participants' convenience.Such data can be collected by researchers in real-time, who can adjust surveys if needed or send prompts in real-time.Social desirability bias is a limitation as respondents could provide answers that are viewed as favorable.Our study compared seven days prospective PA assessments with seven days retrospective PA assessment, a relatively short period that needs to be increased in future longitudinal studies to validate the findings.Finally, although the study can be construed to have a Hawthorne effect [82]-a situation where participants can change their behaviour due to awareness of being observed, there is considerable evidence that participants do not necessarily change their behaviour in observational studies [82,83,84].

Conclusion
Physical inactivity is the fourth leading cause of mortality globally, and it is critical to understand patterns of PA using rigorous and validated tools.The findings of this study show the importance of using prospective EMAs to capture PA, which is particularly relevant in the digital age, where ubiquitous devices provide us with real-time engagement capabilities.More importantly, digital citizen science can transform how we measure behaviours using citizenowned ubiquitous digital tools to support prevention and treatment of non-communicable diseases.